Summary

Reducing stereotype threat in classrooms: 
a review of social-psychological 
intervention studies on improving 
the achievement of Black students 



Stereotype threat arises from a fear 
among members of a group of reinforc- 
ing negative stereotypes about the intel- 
lectual ability of the group. The report 
identifies three randomized controlled 
trial studies that use classroom-based 
strategies to reduce stereotype threat 
and improve the academic performance 
of Black students, narrowing their 
achievement gap with White students. 

This review located and summarized the 
findings of randomized controlled trial stud- 
ies on classroom-based social-psychological 
interventions aimed at reducing the experi- 
ence of stereotype threat that might otherwise 
lead some Black students to underperform on 
difficult academic tasks or tests. Reducing the 
achievement gap between Black and White 
students is a critical goal for states, districts, 
and schools. Experimental research on both 
inducing and reducing stereotype threat can 
inform discussions of strategies. 

Some students may perform below their 
potential because of the stress of being under 
constant evaluation in the classroom. Black 
students, however, may experience another 
source of stress in addition to this general 
one (which they share with their nonminority 



peers). This second source of stress is specific 
to negatively stereotyped groups. It arises from 
a fear of reinforcing negative stereotypes about 
the intellectual ability of their racial group. 
Because Black students must contend with two 
sources of stress rather than one, their perfor- 
mance may be suppressed relative to that of 
their nonminority peers. 

A systematic search was conducted for em- 
pirical studies of classroom-based social- 
psychological interventions designed to reduce 
stereotype threat and thus improve the aca- 
demic performance of Black students. Search 
term combinations, such as “stereotype threat” 
and “intervention,” and “achievement gap” and 
“intervention,” were used to search a number 
of bibliographic databases. In addition, a web 
site on this topic with an extensive reference list 
was also reviewed. This initial search identified 
289 references. After applying relevant inclu- 
sion criteria for topical and sample relevance, 
three experimental studies were identified. The 
three studies found positive impacts on the 
academic performance of Black students for the 
following social-psychological strategies: 

• Reinforce for students the idea that intel- 
ligence is expandable and, like a muscle, 
grows stronger when worked. 






• Teach students that their difficulties in 
school are often part of a normal learning 
curve or adjustment process, rather than 
something unique to them or their racial 
group. 

• Help students reflect on other values in 
their lives beyond school that are sources 
of self-worth for them. 

These three experiments are not an exhaus- 
tive list of the interventions to consider in 
reducing the racial achievement gap, nor are 
they silver bullets for improving the academic 



performance of Black students. Rather, they 
present scientific evidence suggesting that 
such strategies might reduce the level of 
social-psychological threat that some Black 
students might otherwise feel in academic 
performance situations. It is important to 
note that while the strategies use established 
procedures that can be emulated by teach- 
ers and administrators, they also require 
thought and care on the part of schools and 
teachers in applying them in their particular 
situations. 

July 2009 



WHY THIS STUDY? 

At every level of family income and school prepa- 
ration, Black students 1 on average earn relatively 
lower grade point averages (GPAs) and scores on 
standardized tests (Bowen and Bok 1998; Hacker 
1995; Jencks and Phillips 1998; Steele 1997). In 
a society where economic opportunity depends 
heavily on scholastic success, even a partial nar- 
rowing of the achievement gap would lead to a 
positive change in the lives of many academically 
at-risk children. 



Need for the study 

Regional Educational Laboratory Southeast serves 
six southeastern states for which reducing the 
achievement gap between Black students and 
White students continues to be a major concern. 
The data indicate an education crisis in the South- 
east Region, especially for Black male students 
(KewalRamani et al. 2007; Wald and Losen 2005). 
A report by the Southern Regional Education 
Board (SREB) on SAT and ACT scores concludes 
that between 1998 and 2002 none of the 16 SREB 
states narrowed the achievement gap between 
Black and White students (Southern Regional 
Education Board 2003). The achievement gap even 
widened for Black male students. Among the SREB 
states, which include the six states covered by the 
Regional Educational Laboratory Southeast, only 
45 percent of Black male students graduated from 
high school in 2003 compared with 61 percent of 
Black female students, 65 percent of White male 
students, and 67 percent of White female students. 

Thus, Regional Educational Laboratory South- 
east frequently receives requests from Southeast 
Region educators for information on new ideas on 
interventions, programs, and policies that could 
close the achievement gap between Black and 
White students. Several Southeast Region states 
have regularly hosted conferences on this topic 
and published reports based on reviews. 

Many potential contributing factors in the 
achievement gap have been explored, some 



individual (for example, fam- 
ily socioeconomic background, 
self-efficacy, and student aspira- 
tions) and some school related (for 
example, class size, distribution of 
resources across schools, quality 
and diversity of teachers, and lack 
of explicit and high performance 
standards). Other factors external 
to school are also suggested as 
having an impact on racial gaps in 
academic performance, such as a 
lack of high-quality early child- 
hood education and of economic 
opportunities to pursue postsecondary education, 
an important incentive to do well in school. (For 
reviews of research on the achievement gap, see 
Bowen and Bok 1998; Jencks and Phillips 1998; 
Rothstein 2002.) This report recognizes that there 
is a complex set of influential factors and that 
many of them are beyond a teacher’s influence; 
these are not addressed here. Rather, to respond to 
the ongoing need for new information in this area, 
this review located and summarized findings from 
experimental studies on classroom-based social- 
psychological interventions to reduce stereotype 
threat in schools and classrooms that might lead 
some Black students to underperform on difficult 
academic tasks or tests. 2 



What is stereotype threat and how has it been studied? 

What is stereotype threat? Social psychologists 
hypothesize that racial stigma could help explain 
why, on average, Black and White students of 
similar socioeconomic backgrounds perform dif- 
ferently in college and on key standardized tests 
(Steele and Aronson 1995; see also Steele 1997). 

As students progress through school, classroom 
learning environments may become increasingly 
competitive, evaluative in nature, and stressful 
for some minority students. The logic behind 
stereotype threat is that because of an awareness 
of negative stereotypes presupposing academic 
inferiority, Black and other minority students may 
worry that they could confirm the intellectual in- 
feriority alleged by such stereotypes (see appendix 



A for a summary of the research on stereotype 
threat). Such worries, in turn, can hinder their test 
performance, motivation, and learning. 

Research on stereotype threat began with labora- 
tory studies exploring why Black college students 
seemed to be performing below their potential. 
Although a test-taking situation may seem objec- 
tively the same for all students, some students, 
because of their social identity, may experience it 
in a very different way. Steele and Aronson (1995) 
conducted a seminal experiment to explore the 
negative impact of administering a test under 
potentially stereotype-threat-inducing conditions 
by randomly assigning study participants to two 
different test-taking conditions. In one test-taking 
condition, a standardized test (composed of verbal 
Graduate Record Exam items) was presented 
to one group of college students as “diagnostic 
of intellectual ability.” It was hypothesized that 
Black students in this condition would worry that 
performing poorly could confirm a stereotype 
about their racial group’s intellectual ability. 

Black students performed worse in this condition 
than when the same test was given in a second 
condition that introduced the test as one that was 
“not diagnostic of your ability.” The two ways of 
introducing the test had no effect on the perfor- 
mance of White students. Black students in the 
study sample answered roughly 8 of 30 test items 
correctly in the “threat” condition and roughly 12 
of 30 correctly in the “no threat” condition. 

Since the original experimental studies on the 
effects of inducing stereotype threat (Steele and 
Aronson 1995; Steele 1997), there has been an 
explosion of research documenting the negative 
effect of this phenomenon on performance of vari- 
ous types (for reviews see Ryan and Ryan 2005; 
Shapiro and Neuberg 2007; Steele, Spencer, and 
Aronson 2002; Walton and Cohen 2003; Wheeler 
and Petty 2001). Shapiro and Neuberg (2007, 
p. 125), in reviewing this literature, suggest that 

The intellectual excitement surrounding the 
stereotype threat concept and research pro- 
gram stems in large part from the possibility 
BOX 1

Study methods 

Search and screening. This study 
began with a thorough search, 
screening, and quality review to iden- 
tify empirical studies of classroom- 
based social-psychological interven- 
tions designed to reduce stereotype 
threat and thus improve the academic 
performance of Black students. In ad- 
dition to literature searches using key 
terms, a web site on this topic with 
an extensive reference list of peer-re- 
viewed journal articles was examined 
(www.reducingstereotypethreat.org). 
The literature search yielded 158 cita- 
tions, and the web site reference list 
yielded an additional 131 citations. 

The 289 references were then 
screened for inclusion using a set of 
six questions (see appendix C for the 
article screening protocol). A total of 
214 studies were excluded based on 
the initial screening, applying the 
first three criteria (see table B2 and 
figure B1 in appendix B for disposi- 
tion details). Studies were excluded 
as off-topic or irrelevant (87); because 
they were literature reviews, book 
chapters, or summary articles 
rather than empirical studies (20); 
or because they focused on gender- 
based stereotype threat (107). The 
remaining 75 references were subject 
to a second round of screening to 
see whether they met the following 
criteria: 

• Studied the effect of a social- 
psychological intervention (rel- 
evant to reducing the intensity 
of the psychological experience 
of stereotype threat) on im- 
provements to student academic 
performance. 



• Included Black students in the 
sample. 

• Included K-12 students as the focus. 

The second round of screening 
excluded 72 studies. Most studies (65) 
were excluded for failing to meet the 
first criterion — they explored vari- 
ous aspects of the negative impact 
of stereotype threat on performance 
rather than studying interventions to 
reduce the intensity of the experience 
of stereotype threat. 
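
The screening counts reported above can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original report; the variable names are illustrative, and the numbers are the ones stated in this box.

```python
# Screening counts as reported in box 1 (variable names are illustrative).
literature_search = 158   # citations from the key-term literature search
web_site_list = 131       # additional citations from the web site reference list
initial = literature_search + web_site_list

# First-round exclusions, by reason.
first_round_exclusions = {
    "off-topic or irrelevant": 87,
    "reviews, book chapters, or summary articles": 20,
    "gender-based stereotype threat": 107,
}
after_first_round = initial - sum(first_round_exclusions.values())

# Second-round exclusions (65 of these failed the first criterion).
second_round_excluded = 72
included = after_first_round - second_round_excluded

print(initial, after_first_round, included)  # 289 75 3
```

The totals agree with the text: 289 initial references, 75 after the first round, and three studies retained.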

A second, broader verification search 
(using the broadest search term 
“stereotype threat” without the word 
“intervention”) was conducted to 
ensure that relevant studies had not 
been missed. No additional studies 
appropriate for inclusion were found 
among the 741 references identified. 

Assessing the quality of identified inter- 
vention studies. The three remaining 
studies were subject to a final quality 
review to describe any methodologi- 
cal limitations, using a study coding 
protocol (see appendix C) based on 
the five criteria below from the What 
Works Clearinghouse Procedures and 
Standards Handbook (U.S. Depart- 
ment of Education 2008) for assessing 
the internal validity of studies exam- 
ining the effects of interventions: 

• Outcome measures. The measures 
used to assess impact must be 
shown to actually measure what 
they are intended to measure. The 
three studies reported on here 
used appropriate school measures 
of student achievement. 

• Random assignment process. In 
experimental studies researchers 
use random assignment to assign 



participants to experimental con- 
ditions (intervention or control) to 
ensure that the groups are as simi- 
lar as possible on all characteristics 
so that the outcomes measured 
reflect the influence of the inter- 
vention only. Only one study had 
a limitation in this area (Good, 
Aronson, and Inzlicht 2003). 

• Attrition of participants. Loss of 
participants can create differ- 
ences in measured outcomes by 
changing the composition of the 
intervention or control groups. 
Both overall attrition and differen- 
tial attrition (differences between 
intervention and control groups) 
are of concern. All three studies 
were acceptable in this area. 

• Intervention contamination. Inter- 
vention contamination can happen 
when unintended events occur after 
intervention begins that could affect 
group outcomes and therefore the 
conclusions of the experiment. One 
study was noted as having a pos- 
sible limitation in this area (Good, 
Aronson, and Inzlicht 2003). 

• Confounding factor. It is impor- 
tant to examine factors beyond 
the intervention that might affect 
differences between groups, such 
as the effects of teachers or of 
the intervention provider more 
generally. No studies were noted as 
having problems in this area. 
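
As an illustration of the attrition criterion above, overall and differential attrition can be computed from the number of participants assigned and the number analyzed in each group. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, the counts are invented for illustration and are not taken from any of the three studies, and no acceptability thresholds from the handbook are encoded.

```python
def attrition_rate(assigned: int, analyzed: int) -> float:
    """Share of randomly assigned participants missing from the analysis."""
    if assigned <= 0 or not 0 <= analyzed <= assigned:
        raise ValueError("need assigned > 0 and 0 <= analyzed <= assigned")
    return (assigned - analyzed) / assigned


def attrition_summary(assigned_t, analyzed_t, assigned_c, analyzed_c):
    """Return (overall, differential) attrition for two groups."""
    overall = attrition_rate(assigned_t + assigned_c, analyzed_t + analyzed_c)
    differential = abs(
        attrition_rate(assigned_t, analyzed_t)
        - attrition_rate(assigned_c, analyzed_c)
    )
    return overall, differential


# Hypothetical counts for illustration only (not from the reviewed studies).
overall, differential = attrition_summary(
    assigned_t=50, analyzed_t=45, assigned_c=50, analyzed_c=40
)
print(f"overall={overall:.2f}, differential={differential:.2f}")
```

Overall attrition pools both groups, while differential attrition is the gap between the two group rates; a study can look acceptable on one and still raise concern on the other.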

The completed study quality review 
protocols were used in developing the 
final list of limitations reported for 
each of the three studies. 

For further details on the methodol- 
ogy see appendixes B and C. 



that the real-world costs of stereotype threat 
are substantial. ... If one could design effec- 
tive interventions for reducing the experi- 
ence of stereotype threat, then one would 
have a powerful tool for influencing an 
important set of societal problems. 

The experimental manipulations used to study the 
effect of stereotype threat on academic test perfor- 
mance are of two kinds, direct and indirect. The 
direct way of inducing stereotype threat in experi- 
ments has been to tell the test-taking group that 
the test they will take has been sensitive to group 
differences in the past (for example, “this test 
shows racial differences”), thus raising the poten- 
tial relevance of the stereotype as an explanation 
for poor performance. An indirect way of studying 
the negative effects of stereotype threat has been 
to inform students that a test is “diagnostic of your 
ability” (as in Steele and Aronson 1995), convey- 
ing that the test is designed to evaluate students’ 
performance along a stereotype-relevant trait 
(intellectual ability) and consequently bringing to 
the fore concerns about confirming the stereotype. 



To the extent that stereotype threat might be a
factor in some Black students experiencing extra
stress when doing challenging academic work in
school, what can be done to alleviate this stress and
possibly improve their performance? Relatively few
experimental studies have been conducted in classroom
settings on interventions to explicitly reduce
the experience of stereotype threat and thus improve
the academic performance of Black students. However,
some recent classroom-based experimental studies were
identified that have relevance for educators.

This study’s search for empirical studies of
classroom-based social-psychological interventions
designed to reduce stereotype threat and thus improve
the academic performance of Black students initially
identified 289 references (see box 1 and appendix B on
study methods). After relevant inclusion criteria
were applied, three experimental studies were
identified for description here. Those three studies
are described in the following section.



FINDINGS OF THREE EXPERIMENTAL STUDIES
OF INTERVENTIONS TO REDUCE STEREOTYPE
THREAT IN GRADE 7 CLASSROOM SETTINGS

All three studies reported on here found statistically
significant positive effects of the tested interventions
on achievement measures. The following
intervention strategies were tested in the studies
described in detail below:

• Reinforce for students the idea that intelligence
is expandable and, like a muscle, grows
stronger when worked.

• Teach students that their difficulties in school
are often part of a normal “learning curve” or
adjustment process, rather than something
unique to them or their racial group.

• Help students reflect on other values in their
lives beyond school that are sources of
self-worth for them.

Table 3 at the end of the main report summarizes
the outcome measures, analytic techniques, and
the findings across the three studies. (Table B4 in
appendix B summarizes the methodologies.)



Study 1: Blackwell, Trzesniewski, and Dweck 
2007, "Implicit theories of intelligence predict 
achievement across an adolescent transition: 
a longitudinal study and an intervention" 

Intervention idea 

• Reinforce for students the idea that intelli- 
gence is expandable and, like a muscle, grows 
stronger when worked. 

There is much research in psychology explor- 
ing the idea that some students can be trained to 
think more productively about how they approach
performance challenges. One belief that seems to 
affect how students approach such challenges is 
that intelligence is not fixed but malleable, that 
it can be developed through focus and effort and 
thus that intelligence can be taught (Dweck 1999; 
Whimbey 1975). Indeed, Aronson, Fried, and 
Good (2002) posit that some Black students might 
have developed a stereotype-consistent belief that 
their intellectual ability is “fixed,” causing them 
to feel more negative about academic performance 
situations than they would if they believed that 
their ability could grow with greater focus, ef- 
fort, and creativity in problem-solving strategies. 
Alternatively, such students may feel that others 
see their ability as fixed and thus worry about 
negative inferences being drawn about them based 
on their performance. Thus, reinforcing the idea 
that intellectual ability is malleable and incremen- 
tally developed and that others view it in this way 
indirectly reduces students’ sense of psychological 
threat under challenging academic performance 
situations. 

The first study reports on the effects of an inter- 
vention to teach students to see intelligence as 
incrementally developed rather than fixed. 

Research question. Does teaching students to 
see intelligence as malleable or incrementally 
developed lead to higher motivation and perfor- 
mance relative to not being taught this theory of 
intelligence? 



Study sample. The study sample included 91 grade
7 students in an urban public school with
low-achieving students (52 percent Black, 45 percent
Hispanic, and 3 percent White and Asian; 79 percent
eligible for free or reduced-price lunch). There
were 48 students in the intervention group and 43
in the control group. The two groups did not differ
significantly in their prior academic achievement
(fall term math grades) or on any baseline measures
of motivation.

What was the intervention? Students in advisory
classes (periods in the schedule when small
groups of students can receive more individual
attention from a teacher) were randomly assigned
to an intervention or control curriculum to
test the effectiveness of teaching students about
the theory of incremental intelligence. Both groups
received eight weekly 25-minute sessions beginning
in the spring of grade 7 during their regular
advisory class period (to which they had been
assigned at random by the school).

Both intervention and control groups received four
25-minute sessions on the brain, the pitfalls of
stereotyping, and study skills. In four additional
sessions the intervention group received information
that focused on “growing your intelligence”
and involved reading age-appropriate descriptions
of neuroscience experiments documenting brain
growth in response to learning new skills and
class discussions on how learning makes students
smarter. The intervention was based on previous
experimental materials used in studies with
college students (Aronson, Fried, and Good 2002;
Chiu, Hong, and Dweck 1997). For these four sessions
the control group received content unrelated
to the malleability of intelligence and focused
instead on topics about the brain and memory
that were unrelated to the incremental theory of
intelligence.

The sessions were delivered by 16 trained undergraduate
assistants, with two undergraduates assigned
to each class. To ensure consistent delivery
of the intervention materials, session leaders
received reading material and met weekly with the
research team to review the material and prepare
to present it to their assigned advisory class.
Intervention and control workshop leaders met
separately to prepare for the four sessions
with different content.



Results. The researchers first provided results to 
show that their intervention had been successful 



in teaching the intervention group 
students about the incremental 
theory of intelligence. The results 
from a theory of intelligence ques- 
tionnaire given to students before 
and after the intervention showed 
that participants in the interven- 
tion group changed their opinions 
toward a more incremental view of 
intelligence after the intervention. 

The researchers reported that a
paired sample t-test (see box 2 for definition of key
terms) was significant (t = 3.57, p < .05, Cohen’s
d = .66), indicating that the intervention group
endorsed the incremental theory more strongly after
the intervention (mean score of 4.95 on the
questionnaire) than before (4.36). The control group
mean score on the questionnaire did not change
(4.62 preintervention and 4.68 postintervention;
t = 0.32 and not significant, Cohen’s d = .07).
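
To make the reported statistics more concrete, the sketch below shows how a paired sample t-statistic and a standardized mean difference can be computed from pre- and postintervention scores. It is a minimal Python illustration under stated assumptions: the function name and the scores are hypothetical (the study's raw data are not reproduced here), and the effect size is the mean difference divided by the standard deviation of the differences, which is one common convention for paired data and may not be the one the study authors used.

```python
import math
import statistics


def paired_t_and_d(pre, post):
    """Return (t, d, df) for paired pre/post scores.

    t  = mean difference / standard error of the differences
    d  = mean difference / standard deviation of the differences
         (one convention for paired data; an assumption here)
    df = number of pairs minus 1
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    d = mean_diff / sd_diff
    return t, d, n - 1


# Hypothetical questionnaire scores for six students (illustration only).
pre = [4.0, 4.5, 3.5, 5.0, 4.0, 4.5]
post = [4.5, 5.0, 4.5, 5.0, 5.0, 5.0]
t, d, df = paired_t_and_d(pre, post)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

Under this convention t equals d multiplied by the square root of the number of pairs, which is why a moderate effect size can still yield a significant t-statistic in a sample of several dozen students.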



The important question, then, was whether
achievement was higher in the intervention group
as a result of the intervention. The researchers
assessed the effect of the intervention on academic
achievement by examining the growth curves
of participants’ math scores across three points
in time: spring of grade 6 and fall of grade 7 (both
prior to the intervention) and spring of grade
7 (postintervention). The researchers noted an
overall downward trajectory in the mean math
scores for the entire sample (spring grade 6, 2.86;
fall grade 7, 2.33; spring grade 7, 2.11). Analysis
revealed a significant decline in scores for the total
sample between the spring of grade 6 and fall of
grade 7 (b = -.34, t = -4.29, p < .05) and between
fall of grade 7 and spring of grade 7 (b = -.20,
t = -2.61, p < .05).



The researchers further reported that the interven- 
tion group improved from pre- to postintervention 
(fall of grade 7 to spring of grade 7), whereas the 
control group showed a continued downward 
trajectory in performance (figure 1). That is, 
the intervention had a significant positive effect
(b = .53, t = 2.92, p < .05) on math scores from the
fall of grade 7 to the spring of grade 7.

The researchers also collected comments from 
math teachers about students who had shown 
changes in motivational behavior after the advisory 
class sessions. (The teachers did not know to which 
condition their students had been assigned.) The 
study reported that 27 percent of the intervention 
group students received positive comments from 
math teachers about motivational change after the 
intervention, compared with 9 percent of the con- 
trol group, a statistically significant difference. 

Methodological review. No reservations were identified
concerning the methodological quality of the
study based on the study quality review protocol
criteria. (See summary of quality criteria on this
study in appendix B.)

Conclusions. The researchers suggest that the
incremental theory intervention “appears to have
succeeded in halting the decline in mathematics
achievement” (p. 258). Future research on
the role of teachers in changing students’ beliefs
about intelligence is needed, though these results
are promising, particularly as the treatment was
found to yield a significant effect in a low-income,
urban setting where problems associated with
minority underperformance can be severe.

Study limitations. This study was conducted in
a single school, and thus the uniqueness of the
school context or population as the setting for the
intervention is unknown. Another limitation in
generalizing the results of this study is that the
sample of students was racially mixed (primarily
Hispanic and Black), making it difficult to determine
whether the intervention benefited both minority
groups equally. The study authors acknowledge
that the effects were measured at a single
point in time, and it is not known whether the
effects of the intervention would hold up for students
as they moved to grade 8. The intervention
sessions were delivered by trained undergraduate
assistants, not teachers. Thus, it is also unknown
to what extent the intervention effect would hold
up if delivered by teachers rather than trained
undergraduates who, in this case, were closer in
age to the students.



FIGURE 1

Estimated mean math scores by experimental
condition

[Figure not reproduced. Horizontal axis: spring grade 6; fall grade 7
(preintervention); spring grade 7 (postintervention).]

Source: Blackwell, Trzesniewski, and Dweck 2007.



BOX 2

Key terms

t-statistic. For a given sample size,
the t-statistic indicates how often
differences in means as large as or
larger than those reported would be
found when there is no true population
difference in means (the “null
hypothesis”). For example, a reported
t-statistic that is statistically significant
with a p-value of .05 indicates
that in only 5 of 100 instances would
this difference between the means in
a sample be found if the real population
difference were zero.

Degrees of freedom. The number of
independent observations used in a
given statistical calculation and typically
calculated by subtracting 1 from
the number of independent observations
(sample size).

b-statistic. Represents the slope of a
regression line based on predictors
measured in their naturally occurring
units.

F-statistic. Represents the ratio of the
between-group variation divided by
the within-group variation. A statistically
significant F-statistic indicates
that the mean is not the same for all
groups (conditions).

Effect size. The impact of an effect expressed
in standard deviation units.

Cohen’s d. A type of effect size that
represents the standardized mean
difference between the treatment and
control groups.



Study 2: Good, Aronson, and Inzlicht 2003, "Improving
adolescents' standardized test performance: an
intervention to reduce the effects of stereotype threat"

Intervention idea

• Teach students that their difficulties in school
are often part of a normal “learning curve” or
adjustment process, rather than something
unique to them or their racial group.



A related potentially unproductive thought process 
occurs when students attribute academic struggles 
to their intellectual limitations, which may be more 
likely for students who struggle with stereotypes 
about their group’s intellectual inferiority. To the 
extent that students attribute normal difficulties — 
for instance, those that occur with hard-to-learn 
topics or concepts— to fixed personal inadequacies, 
they may experience more distraction, anxiety, 
and pessimism. Thus, interventions might reduce 
the negative effects of stereotype threat, as well 
as other forms of doubt, by encouraging students 
to attribute difficulty in school to the transitory 
struggles all students experience. 



Research question. Can 
teaching students to attri- 
bute academic difficulties 
to transitory situational 
causes rather than to 
stable personal causes im- 
prove standardized math 
and reading test scores? 

Study sample. The 
study took place in a 
rural school district in 
Texas serving a largely 
low-income and pre- 
dominantly minority 



population (63 percent Hispanic, 15 percent Black,
and 22 percent White). Study participants were 
138 grade 7 students enrolled in a computer skills 
class as part of their junior high school curricu- 
lum. Enrollment in the course was randomly 
determined by the school administration, and all 
students in the course participated in the study. 

As part of the regular course curriculum, students 
learned a variety of computer skills including 
using email and designing web pages. 



What was the intervention? Shortly after the school 
year began (mid-October), students in the com- 
puter skills class were randomly assigned a mentor 
with whom they communicated in person and by 
email throughout the school year. They were also 
randomly assigned to receive one of four types of 
educational messages from their mentors: 



• Incremental message (40 students). Students 
learned about the expandable nature of intelli- 
gence (as explored in the previously described 
study). 

• Attribution message (36 students). Students 
learned about the tendency for all students to 
initially experience difficulty during grade 7 
and about the tendency for this difficulty to sub- 
side with time and for performance to improve. 



• Combination of messages (30 students).
Students received both the incremental and 
attribution messages. 






• Control condition (32 stu- 
dents). Students learned about 
the perils of drug use (an 
unrelated topic). 

The mentors conveyed the con- 
tent of their assigned messages in 
person to the students during two 
school visits of 90 minutes each. 
After learning this information, 
students created public service 
announcements on the web with 
guidance from their mentor,
reinforcing the message that they had learned
and helping to internalize the message through 
a self-persuasion process. A restricted web space 
was created for each of the four conditions so that 
students learning a particular message could read
more about their assigned message, and acquire
additional ideas for polishing their web page, but could
not read the messages for the other three groups.

The mentors were 25 college students who partici- 
pated in a three-hour training session on mentor- 
ing required by the district and then supplemen- 
tary training by the researchers on how to convey 
the four messages tied to the four conditions in the 
study. The same mentors delivered the interven- 
tion to students in three of the four conditions. 

Results. At the end of the school year participating 
students’ scores on statewide standardized tests 
in math and reading were analyzed for the four 
groups of students. 

Math test scores were analyzed using a 2 (gender) 
by 4 (experimental condition) analysis of variance. 
The math analyses are not presented here because 
they focused on understanding gender effects, 
which were not the focus of this report. 

Reading scores on the Texas Assessment of 
Academic Skills were analyzed using a one-way 
analysis of variance that compared the per- 
formance of students participating in the four 
experimental conditions. Although the research- 
ers were interested in differences between Black 
and White students’ performance in the four 
conditions, the samples were not large enough to 
analyze the two groups separately. The analysis 
of variance conducted on the state reading test 
scores revealed a significant effect (p < .05) of the
conditions, F(3, 125) = 2.71. Follow-up statistical
tests showed that scores on the state reading
test were significantly higher for students in the 
conditions receiving the incremental (malleable) 
intelligence message (mean score of 88.26) and 
the attributional message (89.62) than for students 
in the control group (84.38) (table 1). There was
no significant difference between the combined
messages condition and the control condition.

TABLE 1

Reported intervention impacts on the spring grade 7 state reading test (Texas Assessment of Academic
Skills)

Intervention effect         Incremental      Attribution      Combined         Control
                            condition        condition        condition        condition

Mean reading score          88.26            89.62            86.71            84.38

Standard deviation          7.17             7.01             8.70             7.79

Difference between each     t(65) = 2.07     t(61) = 2.72     Not
experimental condition      p < .041         p < .008         significant
and control condition       Cohen's d = .52  Cohen's d = .71

Source: Good, Aronson, and Inzlicht 2003.
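
The effect sizes in table 1 can be checked with a short calculation. The sketch below is illustrative and is not the study authors' procedure. It assumes Cohen's d is the difference in means divided by the simple average of the two groups' standard deviations, which is one of the two conventions this report describes. The function name cohens_d_avg_sd is hypothetical.

```python
def cohens_d_avg_sd(mean_treat, sd_treat, mean_ctrl, sd_ctrl):
    """Standardized mean difference, using the average of the two SDs
    as the denominator (an assumption; other conventions exist)."""
    average_sd = (sd_treat + sd_ctrl) / 2
    return (mean_treat - mean_ctrl) / average_sd


# Means and standard deviations as reported in table 1.
control = (84.38, 7.79)
conditions = {
    "incremental": (88.26, 7.17),
    "attribution": (89.62, 7.01),
}

effect_sizes = {
    name: round(cohens_d_avg_sd(mean, sd, *control), 2)
    for name, (mean, sd) in conditions.items()
}
print(effect_sizes)
```

Under this assumption the calculation gives .52 for the incremental condition and .71 for the attribution condition, matching the values reported in table 1.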

Methodological review. Applying the study quality 
review criteria revealed two limitations of the meth- 
odology (see appendix B for complete summary). 

Random assignment process. The study reported 
that 6 of the 138 students’ scores were removed 
from the analysis, which could be considered a 
disruption in the random assignment process. In 
addition, no evidence was presented of the equiva- 
lence of the four groups on baseline achievement. 
Although the authors reported that these six 
students did not come from any particular experi- 
mental condition or group, it is difficult to know 
how well the random assignment process worked 
in creating equivalent groups at baseline without 
these data. Therefore, the study results showing 
differences between experimental conditions after 
the treatment should be interpreted with caution. 

Intervention contamination. The same mentors 
delivered the intervention to students in three of 
the four conditions, so the intervention conditions 
could have been somewhat blurred if the mentors 
brought knowledge from one condition to their 
delivery of another. However, under What Works 
Clearinghouse review standards, contamination 
such as occurred in this study is not considered 
grounds for downgrading a study. 

Conclusions. The authors suggest that showing 
the positive impact of these attitude-changing 



interventions on state test performance builds on 
prior experimental studies showing the effects of 
similar interventions on college students’ class- 
room performance (see Wilson and Linville 1985). 

Study limitations. The sample in this study was 
mixed. Although it consisted mainly of minority 
students, Hispanic students made up 63 percent of 
the sample and Black students only 15 percent. So, 
there are limitations in generalizing the findings to 
Black students alone. As in the first study, teachers 
did not deliver the intervention and thus it is dif- 
ficult to know under what conditions teachers can 
effectively deliver the intervention (for instance,
how much teacher training would be needed and what
kind of materials teachers would use). Nevertheless, the
results are interesting, especially the finding of the 
intervention conditions’ significant effect on aca- 
demic achievement in a low-income school setting. 



Study 3: Cohen, Garcia, Apfel, and Master 
2006, "Reducing the racial achievement gap: 
a social-psychological intervention" 

Intervention idea 

• Help students reflect on other values in their 
lives beyond school that are sources of self- 
worth for them. 

Another route to alleviating stereotype threat 
is to allow individuals to affirm an alternative 
positive identity — one that shores up their sense 
of self-worth in the face of threat. Through



self-affirmation people reinforce 
their sense of personal worth or 
integrity by reflecting on sources 
of value and meaning in their 
lives (Steele 1988). People are 
better able to tolerate psychologi- 
cal threat in one domain (such as 
school) if they can shore up their 
self-worth in another domain 
(such as family). Laboratory 
research shows that self-affir- 
mations can reduce stress (Creswell et al. 2005). 
For example, college students asked to give a 
speech in front of a sullen audience displayed 
lower levels of the stress hormone cortisol if they 
were first given the opportunity to engage in 
the self-affirmation exercise of reflecting on an 
important value, such as their relationships with 
friends. 



What was the intervention? The intervention was 
intended to engage students in a self-affirmation 
process that would alleviate some of the stress 
Black students might feel from stereotype threat 
and thereby improve their academic performance. 
The affirmation intervention was a series of writ- 
ing assignments designed to induce feelings of self- 
worth and test whether psychological threat could 
be lessened through asking students to “reaffirm” 
their “self-integrity.” The assignments (developed 
by the researchers) were provided to students in an 
envelope and included self-explanatory instruc- 
tions that required little teacher involvement. The 
teachers’ role in the study was to hand out the
envelopes containing the writing assignments, provide
a brief scripted introduction to students, and
then remain at their desks and allow students
to independently complete the assignment and
return their work to the teacher in the envelope.



Research question. Would Black students perform 
significantly better in a targeted course when they 
received a self-affirmation intervention than when 
they did not? The researchers hypothesized less 
of an intervention effect for the nonstereotyped 
group, as the risk factors (elevated stress and 
psychological threat) were expected to be lower for 
nonstereotyped students, who do not contend with 
a negative stereotype about their racial group. 

Study sample. The researchers report the results 
of two randomized experiments. The second, a 
replication study, took place a year after the first 
study and with a different cohort of students. A 
total of 119 Black and 124 White grade 7 students 
participated in the two studies (roughly evenly 
distributed across the two studies). Students were 
from a suburban northeastern middle school. 

The three teachers who participated all taught 
the same subject area. At the beginning of the fall 
semester students were randomly assigned to an 
intervention or control condition. Teachers were 
unaware of which students in their classes were 
assigned to which of the two conditions, and the 
two experimental conditions as described below 
were presented to students as part of the regular 
classroom curriculum. 



The envelopes were identical for the intervention 
condition and the control condition assignments, 
so teachers were unaware of which students were 
receiving the self-affirmation intervention. The 
self-affirmation assignment was designed to 
encourage students to think about a personal value 
or values they had singled out as important and its 
significance in their lives. 

Students in both groups received a list of values and 
were asked to read and think about them. The val- 
ues were notions such as athletic ability, creativity, 
music, relationships with friends, independence, re- 
ligious values, and sense of humor. The instructions 
for students in the intervention group asked them 
to select their most important value (or values) and 
to write a paragraph about its importance to them. 
The instructions for students in the control group 
asked them to select their least important value (or 
values) from the list and write about why it might 
be important to someone else. The instructions 
then asked the students in the intervention group to 
write the top two reasons why the value (or values) 
they selected was important to them. The students 
in the control group were instructed to write the 
top two reasons why someone else might consider 
their least important value important. Finally, the
instructions asked students to select their level of
agreement with four statements about the values 
they chose (most important value for the interven- 
tion condition and least important for control con- 
dition) as a way of reinforcing their value selection 
in the affirmation condition. 

Teachers presented the instructions to students 
as a regular classroom assignment. Completing 
the assignment took students in both interven- 
tion and control conditions about 15 minutes. One 
structured writing assignment was provided to 
students in the first study, and two were provided 
to students in the replication study. 

Results. The outcome data collected were students’ 
GPAs from official transcripts in the targeted course 
for the fall term in which the intervention was 
delivered. The data were analyzed using multiple 
regression. The interaction of race (Black or White) 
and experimental condition (affirmation interven- 
tion condition or control condition) was significant 
for study 1 (b = 0.29, t(98) = 2.00, p < .05) and
study 2 (b = 0.52, t(119) = 2.80, p < .01), as was the
treatment main effect for Black students in study 1
(b = 0.26, t(41) = 2.44, p < .02) and study 2 (b = 0.34,
t(60) = 2.69, p < .01). Black students receiving the
affirmation intervention had higher grades in the
targeted course in the fall term than did Black students
in the control condition. The difference in GPA for 
Black students in the intervention condition and the 
control condition was 0.26 point in the first study 
and 0.34 point in the second replication study. 



The mean differences in the outcome measure for
Black students and White students by three levels of
prior academic performance are shown in table 2.
The study reports that the intervention was as strong
for previously low-performing Black students
(t(31) = 2.74, p < .01) as for previously
moderate-performing Black students (t(30) = 2.40,
p < .02). The previously high-performing Black
students benefited less from the intervention
condition (t(31) = 1.72, p < .10).






The difference in GPA
between Black students receiving the affirmation
intervention and those in the control group
was 0.43 point for the previously low-performing
group, 0.44 point for the previously moderate-performing
group, and 0.22 point for the previously
high-performing group. In all three cases Black
students who received the affirmation interven- 
tion had a higher mean GPA in the course than 
did Black students in the control group. Addition- 
ally, the intervention effect for Black students 
extended to courses beyond the targeted course, as 
evidenced in an analysis of students’ mean GPA in 
core academic courses. 



TABLE 2

Covariate-adjusted mean grade point average (averaged over both studies) for intervention and control
groups, by level of preintervention performance

                           Low performing     Moderate performing   High performing
                           student group      student group         student group
Condition                  Black    White     Black    White        Black    White

Affirmation intervention   1.7      2.2       2.8      3.3          3.5      4.0

Control                    1.3      2.3       2.4      3.2          3.3      4.0

Note: The grade point average is that received in the fall term in the academic subject in which the experiment was carried out at the beginning of the
school year. The academic subject area was not identified in the study except to say that it was not one that was typically related to gender stereotypes (for
example, math).

Source: Cohen et al. 2006.

Combining data from studies 1 and 2 shows that
the intervention reduced the percentage of Black
students earning a D or below in the fall term of
the course from 20 percent in the control group, a 
rate consistent with historical norms at the school, 
to 9 percent in the intervention group, a significant 
difference (figure 2). There was no significant dif- 
ference between the intervention and the control 
conditions for White students. 
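
The pattern in table 2 can be made concrete with simple arithmetic. The sketch below uses the rounded means from table 2, so its differences are approximate (the text reports 0.43, 0.44, and 0.22 point for the three groups of Black students). The variable names are chosen here for illustration only.

```python
# Covariate-adjusted mean GPA from table 2 (rounded to one decimal there).
# Keys: performance level -> (condition, race) -> mean GPA.
table2 = {
    "low": {
        ("affirmation", "Black"): 1.7, ("affirmation", "White"): 2.2,
        ("control", "Black"): 1.3, ("control", "White"): 2.3,
    },
    "moderate": {
        ("affirmation", "Black"): 2.8, ("affirmation", "White"): 3.3,
        ("control", "Black"): 2.4, ("control", "White"): 3.2,
    },
    "high": {
        ("affirmation", "Black"): 3.5, ("affirmation", "White"): 4.0,
        ("control", "Black"): 3.3, ("control", "White"): 4.0,
    },
}

summary = {}
for level, gpa in table2.items():
    summary[level] = {
        # Affirmation minus control, Black students only.
        "black_gain": round(
            gpa[("affirmation", "Black")] - gpa[("control", "Black")], 1),
        # White minus Black mean GPA within each condition.
        "gap_control": round(
            gpa[("control", "White")] - gpa[("control", "Black")], 1),
        "gap_affirmation": round(
            gpa[("affirmation", "White")] - gpa[("affirmation", "Black")], 1),
    }

for level, row in summary.items():
    print(level, row)
```

With the rounded table values, the Black-White gap in the targeted course is about 0.5 point at every performance level in the affirmation condition, compared with 0.7 to 1.0 point in the control condition.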

Methodological review. No reservations were 
identified concerning the methodological quality 
of this study on the study quality review protocol 
criteria (see appendix B for details). 

Conclusions. The authors conclude that “our 
intervention is among the first aimed purely at 
altering psychological experience to reduce the 
racial achievement gap.” That is, rather than “lift 
all ships,” the intervention benefits those most in 
need — low-performing Black students. Addition- 
ally, “the research highlights the importance of 
situational threats linked to group identity in 
understanding intellectual achievement in real- 
world, chronically evaluative settings . . . [and] 
challenge[s] conventional and scientific wisdom by
demonstrating that a psychological intervention, 



although brief, can help reduce what many view 
as an intractable disparity in real-world academic 
outcomes” (p. 6). 

Study limitations. Limitations of the study include 
the fact that it was conducted in only one school and 
grade level in a suburban district and that it is dif- 
ficult to determine how representative the sample is 
of the general population from which it was drawn. 
It is thus difficult to know whether the interven- 
tion would yield similar benefits in other schools of 
varying demographic and socioeconomic charac- 
teristics and in other grade levels. Additionally, as 
with the other two interventions reported here, it is 
unclear whether the intervention would be similarly 
beneficial when prepared and implemented entirely 
by teachers rather than trained researchers. Still, 
the results are promising, as the intervention effect 
proved replicable (obtained in two separate stud- 
ies), and the effect of the intervention on minorities’ 
grades was consistently positive across most of the 
range of prior achievement. 



CONCLUDING THOUGHTS ON TURNING 
RESEARCH INTO PRACTICE 



FIGURE 2

Percentage of Black and White students receiving
a grade of D or lower in targeted course in same
semester as the intervention, by experimental
condition

[Figure not reproduced; vertical axis: Percent.]

Source: Cohen et al. 2006.



The objective of this report was to conduct a sys- 
tematic search to identify classroom-based strate- 
gies designed to reduce stereotype threat and thus 
to improve the academic performance of Black 
students. The three studies that were identified 
found that the following social-psychological strat- 
egies had impacts on minority group achievement: 

• Reinforce for students the idea that intelli- 
gence is expandable and, like a muscle, grows 
stronger when worked. 

• Teach students that their difficulties in school 
are often part of a normal “learning curve” or 
adjustment process, rather than something 
unique to them or their racial group. 

• Help students reflect on other values in their 
lives beyond school that are sources of self- 
worth for them.

When considering these studies, several limitations
of this review are important. First, the
search was very focused, intended to identify only 
studies of interventions that had been tried in real 
school settings. For each strategy, there is a larger 
body of social-psychological theory and research 
that led to the testing of the particular interven- 
tion that is not reviewed. Few social-psychological 
studies are conducted in classroom settings, but 
it was important to focus only on studies with 
possible applicability for educators. Another 
limitation is that these strategies do not repre- 
sent all the possible ways of reducing stereotype 
threat, only those that have been studied with 
rigorous research. There may be other, better ways 
of reducing stereotype threat that have not been 
studied. 

Finally, readers should be aware that the studies 
here are small in scope, and their replicability is 
unknown. However, it is clear that the stereotype
threat phenomenon has been experimentally 
shown to exist across a wide variety of studies. 
Thus, it is important to share ideas for reducing 
the negative effects of this phenomenon, even if 
they are in the early stages of knowledge devel- 
opment. For the three experiments reported on 
here, evidence suggests that such strategies might 
reduce the level of psychological threat some Black 
students feel in the classroom and that, combined 
with other efforts, these strategies could benefit 
the performance of Black students. 

Although researchers have developed specific 
protocols to follow for the interventions in some 
contexts, educators might need to adapt the inter- 
ventions to fit their classrooms and then moni- 
tor them to determine what impact they have. 

An understanding of the purpose and process 
involved in using the strategy is important, as 
is professional wisdom about how to apply the 
process in a given classroom context. Such under- 
standing and awareness help ensure that the spirit 
of the intervention is not lost when local condi- 
tions prevent a teacher from strictly following 
the protocols. If school teams or teachers do not 
grapple with the underlying rationale or purpose 



of an intervention, key 
elements may be left out, 
rendering the interven- 
tion less effective. 






For example, the timing of interventions is
important. The interventions in the Blackwell,
Trzesniewski, and Dweck (2007) and Cohen, Garcia,
Apfel, and Master (2006) studies seemed to halt or
at least slow a downward performance spiral for
students. All three studies were conducted on students in grade 7, which
raises the possibility that there may be windows 
of opportunity for influencing student attitudes 
and beliefs. For instance, grade 7 is a time when 
concerns about race-based stereotypes increase for
minority students and is a developmental period 
when adolescents’ sense of identity is in flux. In- 
terventions may be particularly influential at such 
junctures by altering students’ early trajectory and 
preventing a path of compounding failures. 



Thus, the grade level at which the intervention 
ideas are applied is an important consideration, 
as is the timing during the year. For example, the 
self-affirmation assignment may be most effec- 
tive when given at times of high stress, such as 
the beginning of the school year, to halt or reverse 
a downward slide that could otherwise feed off 
itself, with stress worsening performance and with 
deteriorating performance heightening stress in 
a repeating cycle. Such downward slides coincide 
with academic transitions, such as the transition 
to middle school, high school, or college. These 
are times when performance standards shift 
upward, when students’ sense of identity is not 
yet crystallized, and when social-support circles 
are disrupted, heightening stress and feelings of 
exclusion. If a small psychological intervention 
can interrupt a downward spiral at such times, or 
prevent it from emerging, there is the possibility of 
large and long-term effects (Cohen et al. 2006).



Social-psychological research 
suggests that human intellectual 
performance and motivation are 
fragile (Aronson and Inzlicht 
2004; Aronson and Steele 2005). 
The three studies reported here 
suggest that seemingly small 
actions in the classroom — when 
well timed, well targeted, and 
thoughtfully and systematically 
implemented — can produce posi- 
tive results for minority students. 



It is important to bear in mind, however, that 
none of these interventions would work unless 
students already have some ability or motivation 
to improve academically and unless the school 
has the foundational resources to permit students 
to achieve at a higher level. The interventions will 



not teach a student to spell who does not already 
know the fundamentals. They will not suddenly 
motivate an unmotivated student or turn a 
low-performing and underfunded school into a 
model school. More generally, the interventions 
would not work if there were not broader posi- 
tive forces in the school environment (committed 
staff, quality curriculum) operating to facilitate 
student learning and performance. Without these 
broader positive forces, social-psychological 
interventions, while potentially reducing psycho- 
logical threat levels for some students, would be 
unlikely to boost student learning and achieve- 
ment. However, when these broader positive 
forces are in place, social-psychological interven- 
tions such as those reported on here may help 
Black and other minority students to overcome 
stereotype threat and improve their performance 
in school.

TABLE 3

Summary of effects reported by the three studies

Study: Blackwell, Trzesniewski, and Dweck (2007)
Outcome measure: Predicted math grades on a 0 [F] to 4.0 [A] scale
Analysis technique: Growth curve analysis
Treatment effect (a): t(371) = 2.93, p < .05
Description of difference between intervention and control groups: According to a figure presented in the study report, the intervention group averaged roughly a 0.10 increase in math course grades from fall to spring. The increase does not represent a letter grade change (such as C+ to B-); it remains within the C+ range. However, the increase for the intervention group from fall to spring, though small, contrasted with a decline in grades for the control group from fall to spring (from roughly a C+ to a C).

Study: Good, Aronson, and Inzlicht (2003)
Outcome measure: Reading achievement scores on the Texas Assessment of Academic Skills (TAAS) standardized tests
Analysis technique: Analysis of variance
Treatment effect (a):
Incremental condition: t(65) = 2.07, p < .05
Attributional condition: t(61) = 2.72, p < .01
Effect sizes reported:
• Between students who received the message on the incremental theory of intelligence and students in the control group: Cohen's d = .52
• Between students who received the attributional message and students in the control group: Cohen's d = .71
Description of difference between intervention and control groups: Compared with students in the control condition, students in the incremental condition earned an average 3.88 points higher on the TAAS and students in the attributional intervention condition an average of 5.24 points higher. The .52 and .71 effect sizes reported are considered moderate to large effects for educational interventions.

Study: Cohen, Garcia, Apfel, and Master (2006)
Outcome measure: Targeted course grade point average, on a 0 [F] to 4.33 [A+] grade point average scale
Analysis technique: Multiple regression
Treatment effect (a):
Study 1: t(41) = 2.44, p < .02
Study 2 (replication): t(60) = 2.69, p < .01
Description of difference between intervention and control groups: According to the authors, the intervention effects translated into an estimated 0.26 point increase in study 1 and 0.34 point increase in study 2, respectively, in fall targeted course grades for Black students in the intervention condition compared with those in the control condition.



Note: Only the Cohen et al. (2006) study directly analyzed the reduction in the achievement gap between Black and White students. The other two studies 
reported on positive effects of the intervention on the overall sample of students, which included primarily minority (Black and Hispanic) students. However, 
the two studies were not able to compare minority student improvement with that of White students. 

a. The effect size statistic represents the impact of the effect in standard deviation units. Because only the Good, Aronson, and Inzlicht (2003) study calculated
and reported effect sizes, effect sizes could not be compared across the studies. Instead t-statistics and corresponding p-values are reported. For a
given sample size, the p-value associated with a t-statistic indicates how often differences in means as large as or larger than those reported would be found when there is no true
population difference in means (the null hypothesis). The number in parentheses with the t-statistic indicates the degrees of freedom. Cohen's d, a type of
effect size, represents the standardized mean difference between the intervention and control groups. It is calculated by dividing the difference between
the intervention group and control group means by either their average standard deviation or the standard deviation of the control group. See box 2 for
more detailed definitions.



Source: Authors' compilation and calculation from Blackwell, Trzesniewski, and Dweck (2007); Good, Aronson, and Inzlicht (2003); and Cohen et al. (2006).

NOTES

1. This report uses the term Black students 
throughout, even when the reported study 
used a different term. 

2. The No Child Left Behind Act of 2001 refers to 
“scientifically based research” as an impor- 
tant criterion for educators as they consider 
new interventions or strategies. Randomized 
controlled experiments are said to be the 
“gold standard” of the sciences, the highest 
standard of evidence or methodology avail- 
able for studying the effectiveness or impact 
of an intervention. In such experiments 
participants are randomly assigned to one of 
two or more conditions that differ in a critical 
way that is hypothesized to have a particu- 
lar impact. At the simplest level there is an 



intervention group that receives the interven- 
tion and a control group that does not. If the 
students randomly assigned to the interven- 
tion group perform significantly better on 
the outcome measure than do students in the 
control group (a less than 5 percent probabil- 
ity of the difference between the two groups 
being due to chance), it is likely that the 
difference in performance was the result of 
the intervention. Random assignment creates 
groups that should be (on average) identical in 
all dimensions except for receiving the inter- 
vention; thus, any differences in outcomes can 
be attributed to the intervention. The three 
published studies identified and examined in 
this report use this type of research design 
for testing interventions to reduce stereotype 
threat in classrooms and improve academic 
performance. 
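
Random assignment as described in this note can be sketched in a few lines. The example is a hypothetical illustration, not the procedure used in any of the three studies. The function name assign_conditions, the student identifiers, and the fixed seed are all assumptions made for the sketch.

```python
import random


def assign_conditions(student_ids, conditions, seed=None):
    """Randomly assign each student to one condition, keeping group
    sizes as equal as possible (a hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled students to conditions in rotation, so chance
    # alone decides who lands in which group.
    return {
        student: conditions[i % len(conditions)]
        for i, student in enumerate(shuffled)
    }


# Made-up identifiers for 20 students; not data from any study.
students = [f"student_{n:03d}" for n in range(1, 21)]
assignment = assign_conditions(
    students, ["intervention", "control"], seed=42)

group_sizes = {
    condition: list(assignment.values()).count(condition)
    for condition in ["intervention", "control"]
}
print(group_sizes)
```

Because assignment depends only on the shuffle, the two groups should be similar on average in prior achievement and other characteristics, which is what allows a later difference in outcomes to be attributed to the intervention.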

APPENDIX A

RESEARCH ON THE RELATIONSHIP
BETWEEN STEREOTYPE THREAT AND BLACK
STUDENTS' ACADEMIC PERFORMANCE

Students’ academic performance in classrooms, 
because of processes such as stereotype threat, 
can be more variable than people customarily 
think, fluctuating with changes in the situation 
(Aronson and Steele 2005). For example, studies 
show that women’s performance on math tests 
can be made to rise and fall with surprising ease. 
When women were asked to generate a short list 
of qualities shared by men and women, their math 
test performance rose (Rosenthal and Crisp 2006). 
In another study, when women were reminded 
that they were students at a selective liberal arts 
college and their attention was thus turned away 
from their gender, women’s spatial-abilities test 
performance rose and the male-female gap shrank 
(McGlone and Aronson 2006). When women took 
a test in the presence of men, their math perfor- 
mance declined (Inzlicht and Ben-Zeev 2000). But 
when women were presented with a female test 
proctor who excelled in math, their performance 
improved and the male-female gap again shrank 
(Marx and Roman 2002). Such studies underscore 
the degree to which human performance is shaped 
by environmental and psychological forces — not 
simply by how smart a student is or how hard he 
or she works. 

Research on stereotype threat began with labora- 
tory studies exploring why Black college students 
seemed to be performing below their potential. 
Although a test-taking situation may seem objec- 
tively the same for all students, some students, 
because of their social identity, may experience it 
in a very different way. 

Steele and Aronson (1995) conducted an experi- 
ment to explore the negative impact of adminis- 
tering a test under potentially stereotype-threat- 
inducing conditions by randomly assigning study 
participants to two different test-taking condi- 
tions. In one test-taking condition a standardized 
test (composed of verbal Graduate Record Exam
items) was presented to one group of college students
as “diagnostic of intellectual ability.” It was
hypothesized that Black students in this condition 
would worry that performing poorly could con- 
firm a stereotype about their racial group’s intel- 
lectual ability. Black students performed worse in 
this condition than when the same test was given 
in a second condition that introduced the test as 
one that was “not diagnostic of your ability.” The 
two ways of introducing the test had no effect on 
the performance of White students. Black stu- 
dents in the study sample answered roughly 8 of 
30 test items correctly in the “threat” condition 
and roughly 12 of 30 correctly in the “no threat” 
condition. 

Since this first Steele and Aronson study, the con- 
cept of heightened performance stress or anxiety 
for certain groups has been found across a variety 
of potential stereotypes and minority groups. 
Experimental studies have shown that detrimental 
stereotype threat affects not only Black students 
on verbal tests, but Hispanic students on verbal 
tests (Aronson 2002), young women on math tests 
(Quinn and Spencer 2001; Spencer, Steele, and 
Quinn 1999), White men in certain sports situ- 
ations (Stone et al. 1999), students from socio- 
economically disadvantaged households on school 
tests (Croizet and Claire 1998), and high-perform- 
ing White students on math tests when they are 
reminded of the stereotype of Asian superiority in 
math (Aronson et al. 1999). 



Direct and indirect manipulations of stereotype threat 

Experimental manipulations of stereotype threat 
have differed, and these differences can be relevant 
to test-taking instructions used in K-12 settings 
(Quinn and Spencer 2001). One direct way of in- 
ducing stereotype threat in experiments has been 
to tell the test-taking group that the test they will 
take has been sensitive to group differences in the 
past (for example, “this test shows racial differ- 
ences”), thus raising the potential relevance of the 
stereotype as an explanation for the test taker’s 
poor performance. Although drawing attention to 
group differences just before administering a test 
(for example, stating that girls have performed
worse than boys on the math test in the past, or
that Black students as a group performed poorly
on the test the previous year) could cause a few
students to rise to the challenge, the laboratory
research suggests that the average performance of
negatively stereotyped group members decreases.
The fact that some of this laboratory research was
conducted with college students on elite campuses
(Steele and Aronson 1995) suggests that such a
detrimental effect could occur even among the
most confident and skilled students.

A less direct way of studying the negative effects
of stereotype threat has been to inform the
students in the study that the test is “diagnostic
of your ability” (as in Steele and Aronson 1995).
This conveys that the test is designed to evaluate
students’ performance along a stereotype-relevant
trait (intellectual ability) and consequently can
bring to the fore concerns about confirming the
stereotype. Experimental studies have shown that
the performance of the stereotyped group tended
to be poorer in the group that received the instruction
that the test was diagnostic of ability than in
the comparison group that received instructions
emphasizing that the test is not diagnostic of ability
(Spencer et al. 1999; Steele and Aronson 1995).

The power of these direct and indirect ways of
inducing stereotype threat relates to a general
psychological principle that has been widely
studied: the priming effect. The priming effect
refers to the tendency for people to conform their
thoughts, feelings, and behaviors to psychologically
accessible mental constructs such as stereotypes.
Thus, when individuals are “primed” with
a negative stereotype, their interpretations of
ambiguous stimuli, behaviors, or performances
are often influenced by the stereotype, even when
the priming occurs at the unconscious or subliminal
level. The implication of priming effects
for teachers trying to encourage their students to
perform to their potential is that subtle events in
the classroom can undermine a student’s confidence,
trust, and performance. Studies also show
that priming positive concepts, such as being a
good student, can improve performance (McGlone
and Aronson 2006).



Mediating mechanisms

Although inducing stereotype threat conditions
has been shown across multiple studies to result in
poorer performance from the stereotyped group,
the research has been less clear on the mediating
mechanisms, that is, on why stereotype threat results in
poorer performance.



Some researchers have studied mediating mecha- 
nisms that might interfere with the quality of 
the performance under conditions of stereotype 
threat such as increases in stress, anxiety, self- 
consciousness, mental load, or heightened de- 
mands on working memory — all of which could 
lead to less focus on the task at hand, suboptimal 
test-taking strategies (such as guessing more), and 
underperformance (Beilock et al. 2006; Schmader 
and Johns 2003). Making students aware of the 
effects of anxiety from stereotype threat has 
been shown in several studies to improve the 
performance of negatively stereotyped students 
(Johns, Schmader, and Martens 2005; McGlone 
and Aronson 2007), presumably because aware- 
ness of external pressures reduces the tendency to 
attribute test anxiety to one’s intellectual short- 
comings by providing an alternative attribution. 
The study findings suggest that helping students 
understand stereotype threat might inoculate 
them in some way against the extra stress or lack 
of focus that might take their attention away from 
the performance at hand. 



Experiencing stereotype threat over time 

Although difficult to study, some long-term effects 
of repetitively experiencing the extra stress due 
to stereotype threat have been suggested. One 
consequence might be that as Black students have 
the opportunity to make choices in school, some 
of them might avoid challenges by selecting easier 
courses or assignments when they are being aca- 
demically evaluated. Studies with middle school 
minority students have found that students asked 
for easier problems to solve when confronted with
the prospect of being intellectually evaluated on 
the basis of their performance (Aronson and Good 
2002). Compared with White students, the minor- 
ity students showed a strong tendency to take on 
less challenging work, presumably because they 
were threatened by the prospect of looking less 
intelligent if the challenge proved too great. 



But there were individual differences that moder- 
ated these findings. Minority students were less 
likely to avoid a challenge if they believed that the 
challenge could increase their intelligence. Ad- 
ditionally, reducing stereotype threat through an 
experimental intervention increased minorities’ 
interest in taking challenging rather than easy col- 
lege courses (Walton and Cohen 2007). 






APPENDIX B 
METHODOLOGY 

The methodology for this study included a sys- 
tematic search, screening, and review process to 
ensure methodological replicability. 



Search process 

A systematic search was conducted to identify 
empirical studies of classroom-based social- 
psychological interventions designed to reduce ste- 
reotype threat and thus to improve the academic 
performance of Black students. 

The broadest search used the Education Resources 
Information Center (ERIC) and the search term 
“stereotype threat,” resulting in 44 citations. 
Subsequently, narrower search term combinations, 
such as “stereotype threat” and “intervention,” 
and “achievement gap” and “intervention,” were 
used to search several bibliographic databases. 

To identify new literature, PsycINFO was used to
search on “stereotype threat” and “social identity
threat.” Forward citation searches using seminal
stereotype threat papers and searches of reference
lists in newly published work were also conducted.
The searches yielded 158 citations (table B1). In
addition, a web site on this topic, with an extensive
reference list of peer-reviewed journal articles, was
reviewed (www.reducingstereotypethreat.org).
Launched on November 28, 2007, the web site was
developed by Steve Stroessner (Columbia University)
and Catherine Good (Baruch College), but is
now maintained solely by Stroessner. Until June 26,
2008, it was updated monthly or bimonthly. Scanning
the web site reference list resulted in an additional
131 citations, for a total of 289 references.



Screening 

The references were screened twice, first for 
content relevance and then for intervention and 
sample relevance (see appendix C for the six 
screening criteria). 

TABLE B1
Search results

Search engine or web site          Database                  Search terms                              Number of references identified
ERIC                                                         stereotype threat                          44
EBSCOhost                          PsycINFO                  achievement gap and intervention            0
EBSCOhost                          Academic Search Premier   achievement gap and intervention            0
Wilson Web                         Education Index           racial achievement gap and intervention     0
EBSCOhost                          PsycINFO                  stereotype threat and intervention          3
EBSCOhost                          Academic Search Premier   stereotype threat and intervention          1
Wilson Web                         Education Index           stereotype threat and intervention          0
EBSCOhost                          ERIC                      stereotype threat and intervention          2
First Search                       Dissertation abstracts    stereotype and threat                     108
First Search                       Dissertation abstracts    stereotype threat and intervention          0
First Search                       Dissertation abstracts    racial achievement gap and intervention     0
www.reducingstereotypethreat.org   na                        na                                        131
Total references                                                                                       289

na is not applicable.
Source: Authors' compilation.

Initial screening of references. Citation information
from these 289 references was entered into an
internal tracking database for documenting disposition.
These references were first screened for inclusion
using three questions on content relevance
(see article screening protocol in appendix C):

• Is the article on topic? 

• Is the citation an empirical study? 

• Does the study focus on race-based stereotype 
threat? 

If the title or abstract did not provide enough 
information about the study, the full article was 
reviewed for relevance. Table B2 and figure B1 
show the disposition of references. 

Applying the first set of three criteria in the article
screening protocol led to 214 exclusions:

• 87 references were off-topic or irrelevant.

• 20 references were literature reviews,
book chapters, or summary articles rather than
empirical studies.

• 107 references focused on gender-based
stereotype threat (conditions under which
women perform worse than men on math
tests) rather than race-based stereotype threat.

Second-level screening of relevant references. The
remaining 75 references were subject to a second
round of screening to determine whether the studies
met the following criteria:

• Examined the effect of a social-psychological 
intervention (relevant to reducing the in- 
tensity of the psychological experience of 
stereotype threat) on improvements to student 
academic performance. 

• Included Black students in the sample. 

• Included K-12 students as the focus (not 
college students fulfilling requirements to 
participate in experiments). 



FIGURE B1 



First- and second-level screening and assessment 
of the quality of studies 
This second round of screening excluded 72 studies
(see table B2).

The majority of the studies (65) were excluded 
for not meeting the first criterion. The studies 
explored various aspects of the negative impact of 
stereotype threat on Black students. They did not 
test a social-psychological intervention aimed at 
improving Black student performance by reducing 
stereotype threat or mitigating its effects. 

Two studies were excluded because they did not 
include Black students. Studies that included Black 
students as part of their sample were retained. No 
specific percentage of the sample was stipulated 
as having to be Black students. (Also, no criterion 
was specified for sufficient representation of Black 
students for analyses of outcomes by ethnicity.) 

Of the three studies that remained after screen- 
ing, only one study (Cohen et al. 2006) specifi- 
cally analyzed race as a factor. In the Blackwell, 
Trzesniewski, and Dweck (2007) study the 
students were from a large urban school district, 
and all were minority (52 percent were Black and 
45 percent were Hispanic). In the Good, Aronson, 
and Inzlicht (2003) study, the students were from 
a rural district in Texas with 70 percent eligible 
for free or reduced-price lunch (67 percent were 
Hispanic, 13 percent were Black, and 20 percent 
were White). The researchers noted that previous 
research had demonstrated stereotype threat ef- 
fects for Black, Hispanic, and low-income students 
and argued that, for this reason, “all of the partici- 
pants in the sample were potentially susceptible to 
stereotype threat” (p. 652). In the Cohen, Garcia, 
Apfel, and Master (2006) study, participants were 
from a suburban northeastern middle school 
with a student population equally split between 
Black and White students. Whereas the other 
two studies were conducted in socioeconomically 
disadvantaged settings, this study was conducted 
in a suburban area. However, race (Black or White) 
was used as a factor in the analyses (119 Black 
and 124 White students participating). Interest- 
ingly, all three included studies focused on grade 7 
students. 
Five studies were excluded because they did not
include K-12 students as their focus. Though the
studies examined the impact of an intervention on
improving Black student performance, the sample
was college students in laboratory settings, not
K-12 students. Thus, these studies lacked external
validity for this review. Although sampling college
students is a common practice in certain
disciplines, it is difficult to generalize results from
such studies to other
populations, especially to populations that are
significantly younger.
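
The screening arithmetic reported above can be checked with a short sketch. The counts are the ones given in this appendix; the dictionary keys and the helper name `remaining_after` are illustrative labels, not terms from the screening protocol.

```python
# Counts as reported in this appendix; the labels are illustrative shorthand.
identified = {"database searches": 158, "web site reference list": 131}
first_level_exclusions = {
    "off-topic or irrelevant": 87,
    "not an empirical study": 20,
    "gender-based stereotype threat": 107,
}
second_level_exclusions = {
    "no intervention tested": 65,
    "no Black students in sample": 2,
    "not focused on K-12 students": 5,
}


def remaining_after(start, exclusions):
    """Return how many references survive one round of screening."""
    return start - sum(exclusions.values())


total_identified = sum(identified.values())
after_first = remaining_after(total_identified, first_level_exclusions)
after_second = remaining_after(after_first, second_level_exclusions)
print(total_identified, after_first, after_second)  # 289 75 3
```

The totals agree with the text: 289 references identified, 75 remaining after the first screening, and 3 studies retained after the second.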



Verification search 

Because of the small number of studies identified
for inclusion, a second, broader verification search
was conducted to catch any relevant studies that
might have been missed in the focused search of
the databases. This verification search dropped the
word “intervention” and searched the literature
using the terms “stereotype threat,” “stereotype,”
and “threat.” The EBSCOhost search engine was
used to search the ERIC, PsycINFO, Academic
Search Premier, and SocINDEX with Full Text
databases. Also, the Education Index database
was searched using Wilson Web, and the Dissertation
Abstracts database was searched using First
Search. The entire text of identified documents
was searched, not just keywords or title. The only
limit placed on the search was the publication year,
which was set at between 1990 and 2007 (as the
concept of stereotype threat emerged in the 1990s).

This search identified 741 references. Reviews of 
the titles and abstracts turned up no additional 
studies appropriate for inclusion. The reasons for 
exclusion were as follows: 74 percent were off-topic, 
14 percent were not empirical, and 12 percent were 
on-topic but did not test an intervention, occur in 
K-12 classrooms, or include Black students. 



Review process: identifying methodological 
limitations of included studies 

The three studies identified as meeting the six
inclusion criteria in the article screening protocol
(in appendix C) were reviewed first by a Regional
Educational Laboratory (REL) Southeast researcher
using a study quality review protocol (see appendix
C). The researcher adapted the items on the protocol
from one used by REL Central, which provided
the researcher with background knowledge about
the meaning of each item. The completed protocols
for each study and the study articles were then
examined by an external reviewer trained in What
Works Clearinghouse (WWC) criteria.

Development of study quality review protocol.
Researchers for this study obtained a copy of a coding
protocol that REL Central had developed using
the WWC evidence standards (U.S. Department 
of Education 2008) to code studies included in the 
report Using strategy instruction to help struggling 
high schoolers understand what they read (Apthorp 
and Clark 2007). This coding protocol included 
criteria that WWC indicates are important, such 
as adequacy of outcome measure, equivalence of 
groups at baseline, extent of overall and differential 
group attrition, intervention contamination, and 
confounding of teacher and intervention. Also 
included were descriptive items to summarize each 
study, such as independent and dependent variable 
description, summary of analysis and results, and 
an overall narrative summary of the study. 

The REL Central coding protocol was simplified 
for this study, as the intention was to describe any 
limitations in the methodology of the three studies 
based on an interpretation of WWC standards 
and the researchers’ understanding of good sci- 
ence, rather than to conduct a WWC-level review. 
The REL Southeast staff member who developed 
the protocol and who has experience in research 
design used the study quality review protocol to 
gather information from each study on items in the 
protocol: adequacy of outcome measure, random 
assignment process, overall attrition, differential 
group attrition, intervention contamination, and 
confounding factors. A section was not included 
on items related to assessing the quality of quasi- 
experimental designs in the protocol since all three 
identified studies used an experimental design. 

The completed coding protocol on each study was
reviewed by the external reviewer, who raised
questions for clarification with the third researcher
from REL Southeast and the initial coder.

Assessing the quality of identified intervention 
studies. The three studies were subject to a final 
quality review to describe any methodological 
limitations, using a study coding protocol (see ap- 
pendix C) based on the five criteria below from the 
What Works Clearinghouse Procedures and Stan- 
dards Handbook (U.S. Department of Education 
2008) for assessing the internal validity of studies 
examining the effects of interventions: 

• Outcome measures. The measures used to assess 
impact must be shown to actually measure what 
they are intended to measure. For studies in 
school settings, common academic achievement 
measures include state- or locally mandated 
tests and course performance (term grades). The 
three studies reported on here used such school 
measures of student achievement. 

• Random assignment process. In experimental 
studies researchers use random assignment to 
assign participants to experimental condi- 
tions (intervention or control) to ensure that 
the groups are as similar as possible on all 
characteristics so that the outcomes measured 
reflect the influence of the intervention only. 
All three of the studies reported on their 
random assignment process, so any threats to 
random assignment could be identified. Only 
one study had a limitation in this area (Good, 
Aronson, and Inzlicht 2003). 

• Attrition of participants. Loss of participants 
can create differences in measured outcomes by 
changing the composition of the intervention 
or control groups. Both overall attrition and 
differential attrition (differences between inter- 
vention and control groups) are of concern. All 
three studies were acceptable in this area. 

• Intervention contamination. Intervention
contamination can happen when unintended
events occur after intervention begins.
Because these new factors could affect group
outcomes, they also could affect the conclusions
of the experiment. An example is a
teacher in an intervention group sharing the
intervention materials with a teacher in a
control group. One study was noted as having
a possible limitation in this area (Good, Aronson,
and Inzlicht 2003).

• Confounding factor. It is important to examine 
factors beyond the intervention that might affect 
differences between groups, such as the effects 
of teachers or of the intervention provider more 
generally. For example, if each condition of the 
study involves only one teacher’s classroom, then
the effects of the teacher cannot be separated 
from the effects of the intervention. No studies 
were noted as having problems in this area. 

Methodological review. The methodological 
limitations reported for each study were identi- 
fied through this process. The results of the study 
quality review process are shown in the individual 
descriptions of each study below and summarized 
in table B3. Table B4 summarizes the methodology 
of the three studies. 

TABLE B3
Quality of final studies included in report

Study                                      Adequacy of      Random       Overall    Differential  Intervention   Confounding
                                           outcome measure  assignment   attrition  attrition     contamination  factor
                                           (1-3)            process      (1-3)      (1-3)         (1-2)          (1-2)
                                                            (1-3)
Blackwell, Trzesniewski, and Dweck (2007)  1                1            1          1             1              1
Good, Aronson, and Inzlicht (2003)         1                2            1          1             2              1
Cohen, Garcia, Apfel, and Master (2006)    1                1            1          1             1              1

Note: 1 = acceptable; 2 = acceptable with reservations; 3 = not acceptable.
Source: Authors' compilation.

Blackwell, Trzesniewski, and Dweck (2007). No
limitations were noted in applying the quality
review criteria to this study:

• Random assignment process. Students were
randomly assigned by the school to regularly
scheduled advisory classes (groups of 12-14).
Each pre-existing advisory group was assigned
by the research team to an intervention
or control condition. The researchers reported
baseline equivalence data: fall term math
grades for the students were not significantly
different for the two groups (2.38 for the intervention
group and 2.41 for the control group).

• Attrition. The attrition rate (students who did
not complete the eight-week sessions) was
5 percent and roughly equivalent for both
groups (three from the intervention group and
two from the control group).

• Intervention contamination. There was no
reporting of any events during the eight
weekly 25-minute periods that might differentially
affect the two groups. Each advisory
group was assigned to a condition, making it
less likely students would share information
across conditions.

• Confounding factors. The study used undergraduate
assistants to deliver the eight
sessions, assigning two undergraduates as
workshop leaders for each advisory class. Different
workshop leaders were assigned to each
advisory class. Student participants all had the
same math teacher during the study period,
so differences in math teachers could not have
influenced differences in math grades between
the intervention and control students.

Good, Aronson, and Inzlicht (2003). Two limita- 
tions were noted in applying the study quality 
review criteria that might limit the confidence in 
the results of this study. 

• Random assignment process. Six of the 138
students’ scores were removed from the analysis.
In addition, evidence was not presented on
the equivalence of the four groups on baseline
achievement. Although the authors reported
that the six excluded students did not come
from any particular condition, it is difficult
to know how well the random assignment
process worked in creating equivalent groups
at baseline. Therefore, results showing differences
between experimental conditions after
the intervention should be interpreted with
caution.

• Intervention contamination. The same men- 
tors provided the intervention to students 
in three of the four experimental condi- 
tions, so the intervention conditions could 
have been somewhat blurred if the mentors 
brought knowledge from one condition to 
their delivery of another. In addition, students 
were all in the same class so they could have 
discussed or shared their experiences across 
the experimental conditions. Such a problem 
would work against finding a significant 
difference between the control group and the 
other experimental conditions, thus perhaps
strengthening confidence in the intervention
condition effects where found. (Under WWC 
review standards, contamination such as oc- 
curred in this study is not considered grounds 
for downgrading a study.) 
No limitations were found relative to attrition
or confounding factors.

• Attrition. Roughly 4 percent of students were 
excluded from the reading test analysis based 
on an outlier analysis intended to identify 
students whose test score results represented 
very limited English speaking skills. This 
attrition rate is less than the 20 percent level 
determined as significant attrition. The attri- 
tion was reported as occurring equivalently 
across groups. 

• Confounding factors. The participating stu- 
dents were all part of one class, but the teacher 
did not provide the intervention. Students 
were randomly assigned to one of four condi- 
tions and also randomly assigned to a mentor 
who provided their condition. 
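
The overall attrition check referred to above can be sketched as follows. This is an illustration, not the reviewers' procedure: the function names are hypothetical, the 20 percent threshold is the one stated in the study quality review protocol, and the sketch assumes that the six scores removed from the 138 in Good, Aronson, and Inzlicht (2003) are the attrition in question. The Cohen et al. (2006) counts are those reported later in this appendix, with starting samples inferred as final sample plus students lost.

```python
OVERALL_ATTRITION_THRESHOLD = 0.20  # 20 percent, as stated in the review protocol


def attrition_rate(n_lost, n_start):
    """Share of the starting sample that is missing from the analysis."""
    if n_start <= 0:
        raise ValueError("starting sample size must be positive")
    return n_lost / n_start


def is_significant_attrition(rate, threshold=OVERALL_ATTRITION_THRESHOLD):
    """True when attrition exceeds the threshold.

    The protocol does not say how to treat a rate of exactly 20 percent;
    counting it as not significant is an assumption of this sketch.
    """
    return rate > threshold


# (students lost, starting sample) using counts reported in this appendix.
samples = {
    "Good et al. (2003)": (6, 138),
    "Cohen et al. (2006), study 1": (4, 111 + 4),
    "Cohen et al. (2006), study 2": (7, 132 + 7),
}

for name, (lost, start) in samples.items():
    rate = attrition_rate(lost, start)
    print(f"{name}: {rate:.1%}, significant: {is_significant_attrition(rate)}")
```

Each rate falls well below the 20 percent level, consistent with the "acceptable" attrition ratings in table B3.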

Cohen et al. (2006). No limitations were noted in 
applying the quality review criteria to this study, 
as summarized below. 

• Random assignment process. The article 
reported on two randomized, double-blind 
experiments of an affirmation intervention. 
Students in three teachers’ classrooms were 
involved. Random assignment to either the 
affirmation intervention or control condi- 
tion was at the level of the individual student. 
For each teacher/classroom period, there 
were about equal numbers of students in the 
two conditions. Baseline measures for each 
student (standardized measure of pre-inter- 
vention in-class performance, prior year grade 
point average in core courses, and pre-inter- 
vention test score) were collected and used in 
the analysis as potential covariates. 



• Attrition. Individual student attrition (ab- 
sences, missing data, experimenter error) was 
four students for study 1 (roughly 3 percent 
attrition), leaving 111 students in the final 
sample, and seven students from study 2 
(roughly 5 percent), leaving 132 students in 
the final sample. There was no differential at- 
trition as a function of condition, as indicated 
by the authors in a subsequent correspon- 
dence; baseline covariates were used in the 
analysis. 

• Intervention contamination. There was no 
reporting of events or circumstances that 
might have contributed to contamination. The 
experiment was double-blind, so the teach- 
ers did not know what condition the stu- 
dents were assigned to, nor did the students. 
Additionally, neither group was aware of the 
experimental hypothesis, and students were 
unaware of the intervention. 

• Confounding factors. Students were the unit 
of analysis for the study and were randomly 
assigned to the two conditions in approxi- 
mately equal numbers for each of the three 
teachers. Because fall grades in the targeted 
course were the outcome measure and teach- 
ers may grade differently, the regression 
analysis included a teacher variable (dummy 
codes for the three teachers), a main effect 
of baseline in-class performance measures, 
and two terms representing the interaction 
of baseline in-class performance with each of 
the two teacher dummy variables to control 
for teacher differences in the predictiveness 
of early in-class performance. Thus, teacher 
effects were addressed and did not threaten 
internal validity. 
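
The teacher controls described in the last bullet can be made concrete with a sketch of one row of a regression design matrix. It is not the authors' code or their full model: the teacher labels, the choice of reference teacher, the column order, and the function name `design_row` are assumptions, and any other terms in the published analysis (such as race) are omitted.

```python
TEACHERS = ("A", "B", "C")  # hypothetical labels; "A" serves as the reference teacher


def design_row(affirmation, teacher, baseline):
    """Build one student's predictor row.

    Columns: intercept, condition (1 = affirmation, 0 = control),
    two teacher dummies, baseline in-class performance, and the two
    baseline-by-teacher interaction terms.
    """
    if teacher not in TEACHERS:
        raise ValueError(f"unknown teacher label: {teacher!r}")
    teacher_b = 1.0 if teacher == "B" else 0.0
    teacher_c = 1.0 if teacher == "C" else 0.0
    baseline = float(baseline)
    return [
        1.0,                   # intercept
        float(affirmation),    # experimental condition
        teacher_b,             # teacher dummy 1
        teacher_c,             # teacher dummy 2
        baseline,              # main effect of baseline performance
        baseline * teacher_b,  # baseline x teacher dummy 1
        baseline * teacher_c,  # baseline x teacher dummy 2
    ]


print(design_row(affirmation=1, teacher="B", baseline=0.5))
```

With three teachers, two dummy variables suffice because the third teacher is the reference category. The interaction columns let the slope of baseline performance differ by teacher, which is how differences in the predictiveness of early in-class performance are absorbed.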
APPENDIX C

ARTICLE SCREENING AND STUDY 
QUALITY REVIEW PROTOCOLS 



Article screening protocol 

Coder (name and date): 

APA style citation: 



Initial-level screening for relevance 
Content relevance 

1. Is the article on topic? 

□ Yes 

□ No (exclude) 

2. Is the citation an empirical study (not a literature 
review, book chapter, conceptual paper, etc.)? 

□ Yes 

□ No (exclude) 

3. Is the study focused on race-based stereotype threat (as 
opposed to gender or other type of stereotype threat)? 

□ Yes 

□ No (exclude) 

Second-level screening for relevance 
Intervention-type relevance 

4. Does the study investigate interventions aimed at 
reducing stereotype threat? 

□ Yes 

□ No (exclude) 

Sample relevance 

5. Does the study include African American students? 

□ Yes 

□ No (exclude) 

6. Does the study focus on K-12 students? 

□ Yes 

□ No (exclude) 



Relevance screen summary. In order for the study to be 
included, it must have passed all relevance screens (content 
relevance, intervention-type relevance, and sample rel- 
evance). If “yes” for all six items above, the study is eligible 
for inclusion in the report if it is judged of sufficient meth- 
odological quality applying What Works Clearinghouse- 
based criteria. 



Study quality review protocol 

Study information and outcome measure 

1. Study information 

a. Reference citation (author and publication year): 



b. Title: 

c. Source: 

□ Dissertation 

□ Conference presentation 

□ Technical report 

□ Book or book chapter 

□ Journal (specify name): 

2. Adequacy of outcome measure. List the outcome 
measures and the validity and reliability evidence as 
outlined below. 

Examples of validity evidence include: the test or measure
has correlations from studies of concurrent validity,
predictive validity, or factor analysis; or the measure is in
established use as an academic achievement indicator (for
example, a state-developed standardized test administered
as part of an annual student testing program or
course grades on official transcripts) and thus has face
validity as a reliable measure of student achievement.
Examples of reliability evidence include internal consistency,
test-retest, or, if the measure requires judgment,
interrater reliability.

Quality review criteria 

3. Random assignment process. In looking at informa- 
tion included in the study on the random assignment 
process . . . (check one) 
□ There were no reported disruptions of or contaminations
in random assignment process,
and/or baseline equivalence was checked. →
(Acceptable)

□ There was evidence of disruptions or contaminations
in random assignment process, but they were
minor and/or pre-test differences were checked
and were nonsignificant. → (Acceptable with
reservations)

□ There was evidence of disruptions or contaminations
in random assignment process, and pretest
differences were not checked or were checked,
were significant, and were not corrected statistically.
→ (Not acceptable)

4. Attrition. In looking at information included in the
study on attrition . . . (check one)

□ There was no significant attrition (<20 percent
overall). → (Acceptable)

□ There was significant attrition (>20 percent
overall), but postattrition equivalence was
demonstrated. → (Acceptable)

□ There was significant attrition (>20 percent
overall), and postattrition equivalence was not
demonstrated. → (Acceptable with reservations)

□ There was no information on attrition provided,
but degrees of freedom provide adequate information
and indicate no significant attrition (<20 percent
overall). → (Acceptable)

□ There was no information on attrition provided,
but degrees of freedom provide information that
indicates significant attrition (>20 percent overall),
and postattrition equivalence was demonstrated.
→ (Acceptable)

□ There was no information on attrition provided,
but degrees of freedom provide information that
indicates significant attrition (>20 percent overall),
and postattrition equivalence was not
demonstrated. → (Acceptable with reservations)

□ There was no information on attrition provided,
and degrees of freedom do not provide adequate
information. → (Not acceptable)
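
The seven options under item 4 reduce to a small decision rule, sketched below in Python. This is an illustration under stated assumptions, not the reviewers' actual procedure: the function name, the use of `None` for "neither reported nor recoverable from degrees of freedom", and the treatment of exactly 20 percent as significant are all choices made here (the form itself only distinguishes <20 and >20 percent).

```python
from typing import Optional

# Threshold taken from item 4 of the protocol.
OVERALL_ATTRITION_THRESHOLD_PCT = 20.0


def rate_overall_attrition(
    attrition_pct: Optional[float],
    equivalence_demonstrated: bool = False,
) -> str:
    """Map overall attrition onto the protocol's three ratings.

    attrition_pct: overall attrition in percent, whether reported directly
        or inferred from degrees of freedom; None if neither source gives
        adequate information (hypothetical encoding).
    equivalence_demonstrated: whether postattrition equivalence was shown.
    """
    if attrition_pct is None:
        return "Not acceptable"
    if attrition_pct < OVERALL_ATTRITION_THRESHOLD_PCT:
        return "Acceptable"
    # Assumption: exactly 20 percent is treated as significant attrition.
    if equivalence_demonstrated:
        return "Acceptable"
    return "Acceptable with reservations"
```

Note that, under this rule, high attrition alone never produces a "Not acceptable" rating; only missing information does.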

5. Differential sample attrition. In looking at information 
included in the study on sample attrition . . . (check one) 

□ There was no significant attrition differential
between intervention and comparison groups
(<7 percent). → (Acceptable)

□ There was significant attrition differential (>7 percent),
but group comparability was demonstrated.
→ (Acceptable)

□ There was significant attrition differential (>7 percent),
and group comparability was not
demonstrated. → (Acceptable with reservations)

□ There was no information on attrition differential
provided, but degrees of freedom provide adequate
information and indicate no significant attrition
(<7 percent). → (Acceptable)

□ There was no information on attrition differential
provided, but degrees of freedom provide information
that indicates significant attrition (>7 percent);
however, group comparability was demonstrated.
→ (Acceptable)

□ There was no information on attrition differential
provided, but degrees of freedom provide information
that indicates significant attrition (>7 percent);
group comparability was not demonstrated. →
(Acceptable with reservations)

□ There was no information on attrition differential
provided, and degrees of freedom do not provide
adequate information. → (Not acceptable)
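
A short Python sketch of the item 5 rule follows. It assumes that the "attrition differential" is the absolute difference, in percentage points, between the attrition rates of the intervention and comparison groups; the form does not spell this out, so that reading, the function names, and the handling of exactly 7 percent as significant are assumptions made for illustration.

```python
from typing import Optional

# Threshold taken from item 5 of the protocol.
DIFFERENTIAL_THRESHOLD_PCT = 7.0


def attrition_pct(n_assigned: int, n_analyzed: int) -> float:
    """Percentage of assigned participants missing from the analysis."""
    if n_assigned <= 0 or not 0 <= n_analyzed <= n_assigned:
        raise ValueError("need 0 <= n_analyzed <= n_assigned and n_assigned > 0")
    return 100.0 * (n_assigned - n_analyzed) / n_assigned


def rate_differential_attrition(
    differential_pct: Optional[float],
    comparability_demonstrated: bool = False,
) -> str:
    """Map the attrition differential onto the protocol's three ratings.

    differential_pct: absolute difference in attrition between groups, in
        percentage points (assumed interpretation); None if it can be
        neither read from the study nor inferred from degrees of freedom.
    """
    if differential_pct is None:
        return "Not acceptable"
    if differential_pct < DIFFERENTIAL_THRESHOLD_PCT:
        return "Acceptable"
    # Assumption: exactly 7 percent is treated as a significant differential.
    if comparability_demonstrated:
        return "Acceptable"
    return "Acceptable with reservations"


# Hypothetical study: 100 students assigned per group, 90 and 80 analyzed.
differential = abs(attrition_pct(100, 90) - attrition_pct(100, 80))
rating = rate_differential_attrition(differential)
```

In the hypothetical study above, the two groups lose 10 and 20 percent of their students, so the differential exceeds the threshold and the rating depends on whether group comparability was demonstrated.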

6. Intervention contamination. Was there evidence
of something happening after the beginning of the
intervention that affects the outcomes for the intervention
or control group (affects the outcome of one of the
groups in an unexpected way)? (check one)

□ No → (Acceptable)

□ Yes → (Acceptable with reservations)

7. Confounding factor (teacher or other intervention
delivery agent confounds). In looking at information
included in the study on the assignment or role of
teachers or other intervention delivery agents (e.g.,
mentors), was there any situation in which one teacher
or other intervention delivery agent was assigned to
just one experimental condition? (check one)

□ No → (Acceptable)

□ Yes → (Not acceptable)



8. Summary of randomized controlled trial study quality
review criteria. Look back at your answers for questions
2-7 and enter the results in the following box:

Criteria                        | Acceptable | Acceptable with reservations | Not acceptable
Q2. Adequacy of outcome measure |            |                              |
Q3. Random assignment process   |            |                              |
Q4. Overall attrition           |            |                              |
Q5. Differential sample attrition |          |                              |
Q6. Intervention contamination  |            |                              |
Q7. Confounding factor          |            |                              |

Overall study description. The following items are used to 
ensure that consistent information was gathered about each 
study. 

9. Study population sample. 

a. School district/local: 

□ Urban 

□ Suburban 

□ Rural 

□ Missing 



b. Race/ethnicity of students included (check all that 
apply): 

□ African American/Black 

□ Asian/Pacific Islander 

□ Hispanic/Latino 

□ White 

□ Other 

□ Multiracial 

c. Percentage of students receiving free or reduced- 

price lunch: 

d. Grade level (check all that apply): 

□ K-3 

□ 3-5 

□ 6-8 
□ 9-12 

e. Age (mean and/or minimum-maximum): 

f. Total sample size: 



g. Achievement outcome variable (if more than one 
treatment group, specify in parentheses which 
treatment group you are placing in each column) 



Outcome             | Achievement outcome variable | Standard score mean
Outcome 1 (specify) | Treatment 1 (specify)        |
                    | Treatment 2 (specify)        |
                    | Treatment 3 (specify)        |
                    | Control                      |
                    | Overall                      |
Outcome 2 (specify) | Treatment 1 (specify)        |
                    | Treatment 2 (specify)        |
                    | Treatment 3 (specify)        |
                    | Control                      |
                    | Overall                      |








Independent variable/intervention 

10. Intervention description 

a. Briefly describe the intervention, including the 
stated purpose and any required special condi- 
tions or resources: 



b. Subject area (check all that apply): 

□ English/language arts

□ Math

□ Social studies

□ Science

□ Other (specify course title)



c. Duration of intervention: 

Analysis and results

11. Unit of assignment and analysis match. 

a. Was there a match between unit of assignment 
and analysis? (Check one) 

□ Matched, both were students 

□ Matched, both were teachers 

□ Matched, both were schools 

□ Not matched, not addressed in analyses, 
group differences not statistically significant 

□ Not matched, not addressed in analyses, 
group differences statistically significant 

□ Not matched, but addressed in analyses 

Explain: 



b. Was an effect size reported? 

□ No 

□ Yes (specify pages) 



12. Results. Please fill in the following table for each
outcome included in the study.

Outcome             | Statistic                                                                | Notes | Page numbers
Outcome 1 (specify) | Mean/count/proportions (include both treatment and control statistics)  |       |
                    | Sample size (include both treatment and control statistics)             |       |
                    | Standard deviation (include both treatment and control statistics)      |       |
                    | Test statistic                                                           |       |
                    | Were group differences statistically significant? (provide p-value)     |       |
                    | Researcher-reported effect size (including type, if available)          |       |
Outcome 2 (specify) | Mean/count/proportions (include both treatment and control statistics)  |       |
                    | Sample size (include both treatment and control statistics)             |       |
                    | Standard deviation (include both treatment and control statistics)      |       |
                    | Test statistic                                                           |       |
                    | Were group differences statistically significant? (provide p-value)     |       |
                    | Researcher-reported effect size (including type, if available)          |       |